Mon. Not. R. Astron. Soc. 000 (2011) Printed 26 January 2013 (MN LaTeX style file v2.2)



Directed follow-up strategy of low-cadence photometric
surveys in search of transiting exoplanets - I. Bayesian
approach for adaptive scheduling
ABSTRACT

We propose a novel approach to utilize low-cadence photometric surveys for exoplan- 
etary transit search. Even if transits are undetectable in the survey database alone, it 
can still be useful for finding preferred times for directed follow-up observations that 
will maximize the chances to detect transits. We demonstrate the approach through a 
few simulated cases. These simulations are based on the Hipparcos Epoch Photometry 
data base, and the transiting planets whose transits were already detected there. In 
principle, the approach we propose will be suitable for the directed follow-up of the 
photometry from the planned Gaia mission, and it can hopefully significantly increase 
the yield of exoplanetary transits detected, thanks to Gaia. 

Key words: methods: data analysis - methods: observational - methods: statistical 
- techniques: photometric - surveys - planetary systems. 



1 INTRODUCTION 

The idea to detect transits of exoplanets in the Hipparcos
Epoch Photometry (ESA 1997) triggered several studies that
checked the feasibility of such an attempt.
Hébrard & Lecavelier Des Etangs (2006) concluded that
Hipparcos photometry did not look like an efficient tool
for transit detection without using any prior information.
Indeed, some teams have made posterior detections of the
transits of HD 209458 (Robichon & Arenou 2000; Castellano
et al. 2000; Söderhjelm 1999) and of HD 189733 (Bouchy
et al. 2005), a detection that
Hébrard & Lecavelier Des Etangs (2006) confirmed. Those
teams used the previously available knowledge of the orbital
elements of the exoplanets (especially the period and the
transit phase) in order to phase the Hipparcos data of those
stars. Using the large time span that had elapsed since the
Hipparcos observations (for example, about 830 orbital
periods between the Hipparcos observations and the observations
that Castellano et al. (2000) used for HD 209458), the teams
managed to drastically reduce the uncertainties of the known
periods. The posterior detections of both transits prove that
information about the transits does exist in the data,
although it is obviously futile to try to detect the transits
with the naive Box Least Squares (BLS) approach (Kovács
et al. 2002). Thus the posterior detections motivated us to
re-examine Hipparcos Epoch Photometry data and to look for a
way to utilize this survey and similar low-cadence photomet- 
ric surveys, to detect exoplanets. The approach we propose 
here is to use the data to maximize the chances to detect 
transits during hypothetical follow-up campaigns, i.e., in- 
stead of attempting to detect a transit, we use the data to 
schedule follow-up observations that together with the old 
data set may enable its detection. 

In order to maximize the probability of sampling a tran- 
sit in those surveys, and in order to predict the best possible 
future observing times, we chose to use Bayesian inference 
methods. 

Bayesian analysis is based on Bayes' theorem, which can
be written as

\[ p(H_i \mid D, I) = \frac{p(H_i \mid I)\, p(D \mid H_i, I)}{p(D \mid I)} , \qquad (1) \]

where $p(H_i \mid D, I)$ is the posterior probability of the
hypothesis $H_i$, given the prior information, $I$, and the
data, $D$.

$p(D \mid H_i, I)$ is the probability of obtaining the data $D$,
given that $H_i$ and $I$ are true. It is also known as the
likelihood function $L(H_i)$.

$p(D \mid I) = \sum_i p(H_i \mid I)\, p(D \mid H_i, I)$ is a
normalization factor that ensures that
$\sum_i p(H_i \mid D, I) = 1$. It is usually referred to as the
prior predictive probability for $D$, or the
global likelihood for the entire class of hypotheses.
In the Bayesian framework we start from prior knowledge,
which we introduce through the prior probability distribution,
$p(H_0 \mid I)$. The choice of prior distribution can affect the
posterior distribution, especially if our observed data do not
strongly constrain the model parameters. If our prior
knowledge is poor, $p(H_0 \mid I)$ can spread over a wide range of
possible values for the model parameters.

Whenever new data are available, it is possible to
incorporate them in our model through the likelihood
function, combined with the prior, to obtain a new posterior
probability density, $p(H_0 \mid D_1, I)$, for the parameters. As
soon as we obtain another set of data, $D_2$, we recalculate the
posterior probability density so that it reflects our new
state of knowledge. The possibility of combining new data sets
with the original data will allow us to accomplish
our goal of detecting transiting exoplanets using scheduled
follow-up observations.
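The sequential updating described above can be illustrated with a minimal sketch for a discrete set of hypotheses. The function name `bayes_update` and all numerical values are illustrative assumptions, not taken from the paper, and the two data sets are assumed to be conditionally independent given each hypothesis.

```python
def bayes_update(prior, likelihoods):
    """Posterior p(H_i | D, I) for a discrete set of hypotheses (equation 1)."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    evidence = sum(unnormalized)  # p(D | I), the normalization factor
    return [u / evidence for u in unnormalized]

# Hypothetical numbers: two hypotheses, two independent data sets D1 and D2.
prior = [0.5, 0.5]
like_d1 = [0.8, 0.4]   # assumed p(D1 | H_i, I)
like_d2 = [0.6, 0.3]   # assumed p(D2 | H_i, I)

# Update with D1 first, then use that posterior as the prior for D2.
post_d1 = bayes_update(prior, like_d1)
post_d2 = bayes_update(post_d1, like_d2)

# Updating once with both data sets together gives the same answer.
post_joint = bayes_update(prior, [a * b for a, b in zip(like_d1, like_d2)])
```

Under these assumptions the order in which the data sets arrive does not matter, which is what allows follow-up observations to be folded into the survey data one at a time.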

Gregory (2005a), Ford (2006) and others have already
shown that Bayesian inference is a useful tool for
analyzing precise radial velocity (RV) data of planet-hosting
stars. Gregory (2007) used Bayesian inference model
selection for the problem of multiple planets, and Gregory
(2005a) used it to construct posterior probability density
functions of the light-curve parameters. Defaÿ et al. (2001)
and Aigrain & Favata (2002) demonstrated the use of the
Bayesian approach to study planetary transits.

Our implementation of Bayesian inference is based on
the Metropolis-Hastings (MH) algorithm, which is a version
of the more general Markov Chain Monte Carlo (MCMC)
approach (Gregory 2005b).

A Markov chain is calculated using an initial set
of parameter values, $X_0$, and a transition probability,
$p(X_{n+1} \mid X_n, I)$, that describes the probability of moving
from the current state to the next one. The transition
probability depends on the acceptance probability, described later
in Section 3, and if it is properly constructed, then after
excluding the so-called "burn-in" period, we can use the chain as a
sample from the desired distribution. The MH algorithm is an
implementation of MCMC that is used for obtaining a
sequence of random samples from a probability distribution.

The MH algorithm does not require a good initial guess
of the parameter values in order to estimate the posterior
distribution. This is one of the most important advantages
of the algorithm. The algorithm is capable of exploring all
regions of the parameter space that have significant
probability (assuming it meets several basic requirements). The
analysis also yields the marginal posterior probability
distribution functions for each of the model parameters, and
their uncertainties.

In Section 2 we describe the follow-up approach we
developed to detect transiting exoplanets, based on
observations from low-cadence surveys. In Section 3 we give a brief
review of Bayesian inference and its applications for our
follow-up strategy. Sections 4 and 5 demonstrate the
approach by applying it to two stars that are known to harbor
hot Jupiters, HD 209458 and HD 189733, using the
Hipparcos data base. Section 6 shows some "sanity checks" we
performed on data that do not contain the transit signal
at all. We conclude and describe future applications of the
strategy in Section 7.



2 DIRECTED FOLLOW-UP 

Our ultimate goal is to detect transiting exoplanets using 
follow-up observations, which will be carefully scheduled to 
increase the chances to capture transits, should they exist. 
Thus, our approach does not focus on obtaining a detailed 
transit model that best fits the available data, but on build- 
ing a probability distribution function of the parameters of 
a simple model, based on these data. The transit model that 
we use in this work is a very simplistic one, based on the BLS
philosophy (Kovács et al. 2002). Thus, we model a transit
light curve as a box-shaped transit with two phases, in and 
out of transit, and ignore the duration of the ingress and 
egress phases, as well as the details of the limb darkening. 
These details are less relevant in low-precision, low-cadence 
surveys, and using fewer parameters makes the model more 
robust. We use the Bayesian MH algorithm to obtain a pos- 
terior probability distribution for the model parameters, and 
then use this distribution to prioritize the timing of the ob- 
servations of the chosen stars for follow-up observations. 

The directed follow-up approach is not suitable for
space missions like CoRoT or Kepler. Such missions, due
to their high cadence, will not benefit from the approach
since their phase and period coverage are already quite
complete. Instead, we aim for all-sky surveys like
Hipparcos (van Leeuwen et al. 1997), or its successor, Gaia
(Jordi et al. 2006), in order to use their extensive
low-cadence photometric databases for exoplanet searches.

A simplified (BLS-like) transit light curve is
parametrized by five quantities, namely the period, phase,
and width of the transit, and the flux levels in-transit and
ex-transit. The first step in our proposed procedure is to
apply the MH algorithm to the Hipparcos measurements of
a target star. This results in five Markov chains that include
the successful iterations for each one of the parameters. A
successful iteration is one that was accepted by the MH
algorithm. After removing the "burn-in" period, each chain
represents the stationary distribution of the parameters,
which we use as their estimated Bayesian posterior
distributions, for our current state of knowledge (Gregory
2005b). Unlike the case of precise high-cadence surveys,
even if the star does host a transiting exoplanet, due to the
low precision and low cadence of the observations we do
not expect the distribution to concentrate around a single
solution, but rather to show different periods that might fit
the data, besides the unknown correct one.

The next step of our procedure is to assign to each point
in time the probability that a transit will occur at that time.
Calculating this probability is easy using the posterior
distributions we found in the first stage. Basically, for time t,
we count the number of successful MCMC iterations whose
values of P, T_c and w predict a transit at time t. Normalizing
this number by the total number of iterations yields the
Instantaneous Transit Probability (ITP) for time t. If the
ITP has significantly high values for certain times, then a
follow-up observation is worthwhile at those preferred times.
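The counting procedure for the ITP can be sketched as follows. This is only an illustration under stated assumptions: the helper names `in_transit` and `itp` are hypothetical, the chain is a short made-up list of (P, T_c, w) samples, and the count is normalized by the number of samples passed in.

```python
def in_transit(t, period, t_c, width):
    """True if time t lies inside a box-shaped transit centred on t_c + n * period."""
    # Offset from the nearest mid-transit, folded into [-period/2, period/2).
    dt = (t - t_c + 0.5 * period) % period - 0.5 * period
    return abs(dt) <= 0.5 * width

def itp(t, chain):
    """Fraction of chain samples (P, T_c, w) that predict a transit at time t."""
    hits = sum(1 for period, t_c, width in chain
               if in_transit(t, period, t_c, width))
    return hits / len(chain)

# Hypothetical posterior samples (period, mid-transit time, duration), in days.
chain = [(3.5, 0.0, 0.1), (3.5, 0.0, 0.1), (7.0, 0.0, 0.1), (2.0, 1.0, 0.1)]

# Evaluate the ITP on a coarse grid of candidate follow-up times.
grid = [0.25 * i for i in range(0, 41)]
itp_values = [itp(t, chain) for t in grid]
best_time = grid[itp_values.index(max(itp_values))]
```

Times at which several competing periods all predict a transit receive the highest ITP, which is why they are the preferred follow-up times.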

When we examine the ITP function of different
observations and simulations we performed, it is clear that,
when a transit signal exists, this function contains
sharp peaks, where the probability of transit is relatively
high. This behavior is crucial for defining preferred times
for follow-up observations. If we sample the values of the
predicted probabilities we can test for this behavior by the
skewness of this sample. The skewness of a random variable x
is generally defined as

\[ S = \frac{\langle (x - \langle x \rangle)^3 \rangle}{\langle (x - \langle x \rangle)^2 \rangle^{3/2}} , \qquad (2) \]

where $\langle \cdot \rangle$ denotes the operation of averaging over
the sample values.

The skewness of the ITP is actually a measure of the
amount of outliers, where the term outliers refers
to peaks with significantly high values. The absence of such
'outliers' means that there are no preferred times for
follow-up observation; thus the ITP values will be more
symmetrically distributed, with a skewness value close to zero (S = 0
for a normal distribution).
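As a small illustration of the skewness criterion, the sketch below computes the sample skewness of two made-up sets of ITP values. The function name `skewness` and the sample values are assumptions for illustration; the moments are plain sample averages, as in equation (2).

```python
def skewness(values):
    """Sample skewness: third central moment over the 3/2 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical ITP samples: one symmetric, one dominated by a single sharp peak.
flat_itp = [0.1, 0.2, 0.3, 0.4, 0.5]
peaked_itp = [0.0] * 9 + [1.0]

s_flat = skewness(flat_itp)      # close to zero: no preferred time
s_peaked = skewness(peaked_itp)  # clearly positive: one preferred time stands out
```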

In order to prioritize the stars for follow-up observa- 
tions, we need to rank them. We propose to use the skewness 
of the ITP as a criterion for prioritizing stars for follow-up 
observations, together with the actual ITP values for follow- 
up times predictions. 

Another criterion we propose for the prioritization
process is to use the Wald test for the posterior probability
distribution of the transit depth, which is a measure of the
'strength' of the signal we are looking for. The Wald test,
named after Abraham Wald, is the most simplistic
statistical test designed to examine the acceptance of a hypothesis
(e.g. Lyons & Karagöz Ünel 2006). The Wald statistic for a
random variable x is defined by the simple expression

\[ W = \frac{E(x) - x_0}{\mathrm{std}(x)} , \qquad (3) \]

where E(x) and std(x) are the expected value and standard
deviation, respectively, of the variable x, and x_0 is the
nominal value of x according to the null hypothesis. In our
case x is the transit depth, and the moments (E and std) are
calculated based on the posterior probability distribution;
x_0 is simply zero, since our null hypothesis is the absence of any
transit. In a sense, the Wald statistic for the transit depth
quantifies the degree to which we believe there is a peri- 
odic transit-like dimming of the star, based on the available 
photometry. A high value of the Wald statistic indicates a 
relatively narrow posterior distribution of the transit depth. 
This may indicate that there are periods according to which 
the low-flux measurements, corresponding to a transit-like 
dimming, are relatively concentrated in a short phase. This 
short phase can be the hypothetical transit which we look 
for. 
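The Wald statistic of equation (3) is straightforward to evaluate from posterior samples of the transit depth. The sketch below is illustrative only: the depth samples are invented, the function name `wald_statistic` is hypothetical, and the population standard deviation is used as an assumption (the text does not specify which estimator was adopted).

```python
import statistics

def wald_statistic(samples, x0=0.0):
    """(E(x) - x0) / std(x), estimated from posterior samples of x."""
    return (statistics.fmean(samples) - x0) / statistics.pstdev(samples)

# Hypothetical posterior samples of the transit depth d, in magnitudes.
depth_samples = [0.018, 0.020, 0.022, 0.019, 0.021]

# Null hypothesis: no transit, i.e. x0 = 0.
w_stat = wald_statistic(depth_samples, x0=0.0)
```

A narrow depth posterior far from zero gives a large value, while a broad posterior that is compatible with zero depth gives a value of order unity.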

Performing the follow-up observations at the times di- 
rected by the previous step is the final step of the strategy. 
A combination of both the 'old' data from the survey and 
the new observations at the directed time eliminates periods 
that do not fit our new state of knowledge. The procedure is 
repeated until we detect a transiting planet, or exclude its 
existence. 

In Sections 4 and 5 we examine the strategy for two
known transiting planets, HD 209458b and HD 189733b,
using the Hipparcos photometric catalogue. The promising
results show that the strategy is efficient in utilizing
low-cadence, low-precision surveys for exoplanet transit searches.



3 BAYESIAN APPROACH - SIMPLIFIED TRANSIT MODEL



Inspired by the BLS (Kovács et al. 2002), our model is a
simple box-shaped transit light curve, with five parameters
that characterize it: $X = \{P, T_c, w, d, m\}$, where P is the
orbital period, T_c is the time of mid-transit, w is the transit
duration, d is the depth of the eclipse and m is the mean
magnitude out of transit.

Let $v_k$ and $\sigma_k$ denote the observed magnitude and its
associated uncertainty at time $t_k$, respectively. Let m
denote the magnitude out of transit, which is assumed to be
constant.

Assuming a simple 'white' Gaussian model for the
observations, we can write down the likelihood function
explicitly:

\[ p(D \mid X) = \prod_{k=1}^{K} \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left[ -\frac{(v_k - \mu_k)^2}{2\sigma_k^2} \right] = \left( \prod_{k=1}^{K} \frac{1}{\sqrt{2\pi}\,\sigma_k} \right) \exp\left( -\sum_{k=1}^{K} \frac{(v_k - \mu_k)^2}{2\sigma_k^2} \right) , \qquad (4) \]

where

\[ \mu_k = \begin{cases} m & \text{if } t_k \text{ is out of transit,} \\ m + d & \text{if } t_k \text{ is in transit,} \end{cases} \]

and K is the number of observations. (Note that the
magnitude during transit is defined as m + d, since we use
magnitude units and not flux units.) The exponent in equation
(4) is, up to its sign, half the well-known $\chi^2$ statistic.
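The likelihood of equation (4) is usually evaluated in logarithmic form for numerical stability. The following sketch does this for the box-shaped model; the function names, the parameter ordering (P, T_c, w, d, m) and the toy data are illustrative assumptions rather than the authors' implementation.

```python
import math

def in_transit(t, period, t_c, width):
    """True if time t lies inside a box transit centred on t_c + n * period."""
    dt = (t - t_c + 0.5 * period) % period - 0.5 * period
    return abs(dt) <= 0.5 * width

def model_magnitude(t, params):
    """Box model: m out of transit, m + d in transit (magnitudes, so fainter is larger)."""
    period, t_c, width, depth, mag = params
    return mag + depth if in_transit(t, period, t_c, width) else mag

def log_likelihood(times, mags, sigmas, params):
    """Logarithm of equation (4) for independent Gaussian errors."""
    total = 0.0
    for t, v, s in zip(times, mags, sigmas):
        mu = model_magnitude(t, params)
        total += -0.5 * math.log(2.0 * math.pi * s * s) - 0.5 * ((v - mu) / s) ** 2
    return total

# Toy noiseless data: transits at t = 0, 3.5 and 7 d, 0.02 mag deep.
times = [0.0, 1.0, 3.5, 5.0, 7.0]
mags = [7.67, 7.65, 7.67, 7.65, 7.67]
sigmas = [0.01] * 5

ll_transit = log_likelihood(times, mags, sigmas, (3.5, 0.0, 0.2, 0.02, 7.65))
ll_flat = log_likelihood(times, mags, sigmas, (3.5, 0.0, 0.2, 0.0, 7.65))
```

In this toy case the flat model misses three points by two standard deviations each, so its log-likelihood is lower by half the corresponding chi-square, i.e. by 6.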

The MH algorithm can now be summarized by the
following description:

1. Initialize $X_0$ - the initial guess for the set of model
parameters; set n = 0.

2. Draw a sample Y (trial state) from a proposal distribution
$q(Y \mid X_n)$. This distribution can be a Gaussian distribution,
centered around $X_n$ - the current set of model parameter
values.

The acceptance probability, $\alpha$, is defined by
$\alpha = \min(1, r)$, where

\[ r = \frac{p(Y)\, p(D \mid Y)}{p(X_n)\, p(D \mid X_n)} \, \frac{q(X_n \mid Y)}{q(Y \mid X_n)} \qquad (5) \]

is the Metropolis ratio, which is composed of the
prior × likelihood and the proposal distributions. If the
proposal distribution is symmetric, then the second factor in
the Metropolis ratio is equal to 1.

3. Sample a random variable, u, from a uniform distribution
in the interval 0-1.

4. If $u \le \alpha$, set $X_{n+1} = Y$ (a successful iteration); else set
$X_{n+1} = X_n$.

This last step results in accepting the new trial state Y
with probability $\alpha$.

5. n = n + 1.

6. Go back to step 2.

Steps 2-6 are repeated N-1 times to produce a Markov
chain of length N.

For a wide range of proposal distributions, $q(Y \mid X_n)$,
after an initial burn-in period (which is discarded), the
algorithm creates samples of $X_n$ with a probability density
function that converges to the desired distribution, the posterior
distribution, $p(X \mid D)$.
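The steps above can be sketched for a one-dimensional toy problem. This is a minimal illustration, not the authors' code: the function name `metropolis_hastings`, the Gaussian target, the step size and the burn-in length are all assumptions. The proposal is symmetric, so the second factor of equation (5) equals 1, and the ratio is evaluated in logarithmic form. Note that this sketch stores the state at every iteration, repeating the current state after a rejection, which is the usual convention.

```python
import math
import random

def metropolis_hastings(log_post, x0, step, n_steps, rng):
    """Markov chain of length n_steps using a symmetric Gaussian proposal."""
    chain = [x0]
    x, lp = x0, log_post(x0)   # x0 must have a finite log posterior
    accepted = 0
    for _ in range(n_steps - 1):
        y = x + rng.gauss(0.0, step)   # trial state Y drawn from q(Y | X_n)
        lp_y = log_post(y)
        log_r = lp_y - lp              # log Metropolis ratio (symmetric proposal)
        u = rng.random()               # uniform random number in [0, 1)
        if log_r >= 0.0 or u < math.exp(log_r):
            x, lp = y, lp_y            # successful iteration
            accepted += 1
        chain.append(x)                # on rejection the current state is repeated
    return chain, accepted

# Assumed toy target: unnormalized Gaussian posterior with mean 2 and sigma 0.5.
def log_post(x):
    return -0.5 * ((x - 2.0) / 0.5) ** 2

rng = random.Random(0)
chain, accepted = metropolis_hastings(log_post, x0=10.0, step=0.5,
                                      n_steps=20000, rng=rng)
samples = chain[2000:]                 # discard an assumed burn-in period
post_mean = sum(samples) / len(samples)
post_std = math.sqrt(sum((s - post_mean) ** 2 for s in samples) / len(samples))
```

Even though the chain starts far from the bulk of the target, the retained samples should recover its mean and width, which illustrates why a good initial guess is not required.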
3.1 Choice of priors

The choice of prior distributions is important in Bayesian
analysis, as an uneducated choice can produce misleading
results (Balan & Lahav 2009).

For the orbital period, we adopt the approach proposed
by Ford (2006) and use a uniform prior in log P,

\[ p(P) = \frac{1}{P \ln\left( P_{\max} / P_{\min} \right)} . \qquad (6) \]

The theoretical lower limit of the orbital period according
to the Roche limit ($d \approx 2.423\, R_s \sqrt[3]{\rho_s / \rho_p}$)
for a planet with $m_p \sim 10\, M_{\rm Jup}$, orbiting a star with a
solar mass, is approximately 0.2 d (Ford & Gregory 2007), while
the upper limit can be set at around 10 s d, or about three times
the duration of the data (Gregory 2005a). Since the time-span of
the data is long (for example, Hipparcos data span more
than 1000 d), we choose an upper limit of the order of the
time-span of the data, which is much longer than the period
of any known transiting exoplanet.

We use the same form of prior for the transit duration,

\[ p(w) = \frac{1}{w \ln\left( w_{\max} / w_{\min} \right)} , \qquad (7) \]

where we chose, somewhat arbitrarily, a lower limit of w =
0.001 d and an upper limit of 1 d. This range includes all
known exoplanetary transit durations. The other three
parameters of the transit model (T_c, d, m) were assigned a
uniform prior, where T_c and m have the range of the data as
their upper and lower limits, and the transit depth is
uniformly distributed between 0 and 1. The prior distribution
in our problem, assuming that the parameters are
independent, can be described by

\[ p(X_n) = p(P)\, p(T_c)\, p(w)\, p(d)\, p(m) . \qquad (8) \]



The most significant dependence expected between the or- 
bital parameters is between P and w, since the maximum 
value of w can be related to P. The strong dependence of w 
on the orbital inclination 'masks out' this correlation which 
is why we chose to ignore it in this work. In future appli- 
cations we may use more complicated priors, which might 
include dependence among the parameters. 
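The independent priors of equations (6)-(8) can be combined into a single log-prior function, as in the sketch below. The function names and the `limits` dictionary are hypothetical; the period range (0.2 to 1000 d) and the ranges for T_c and m are illustrative values chosen in the spirit of the text, not the limits actually used by the authors.

```python
import math

def log_uniform_logpdf(x, lo, hi):
    """log of p(x) = 1 / (x ln(hi/lo)) on [lo, hi], i.e. a prior uniform in log x."""
    if not lo <= x <= hi:
        return -math.inf
    return -math.log(x * math.log(hi / lo))

def uniform_logpdf(x, lo, hi):
    """log of a uniform density on [lo, hi]."""
    if not lo <= x <= hi:
        return -math.inf
    return -math.log(hi - lo)

def log_prior(period, t_c, width, depth, mag, limits):
    """Sum of independent log priors, following the form of equation (8)."""
    return (log_uniform_logpdf(period, *limits["P"])
            + uniform_logpdf(t_c, *limits["T_c"])
            + log_uniform_logpdf(width, *limits["w"])
            + uniform_logpdf(depth, *limits["d"])
            + uniform_logpdf(mag, *limits["m"]))

# Assumed, illustrative limits (days and magnitudes).
limits = {"P": (0.2, 1000.0), "T_c": (0.0, 1084.0), "w": (0.001, 1.0),
          "d": (0.0, 1.0), "m": (7.0, 8.0)}

lp_inside = log_prior(3.5, 10.0, 0.1, 0.02, 7.65, limits)
lp_outside = log_prior(0.1, 10.0, 0.1, 0.02, 7.65, limits)  # period below the lower limit
```

Returning minus infinity outside the allowed ranges means that a Metropolis-Hastings step proposing such a state is always rejected.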

3.2 The Proposal Distribution 

We choose the proposal distribution for each parameter
individually. For T_c, d and m we use Gaussian proposal
distributions centered around the last value in the Markov chain.
For the transit duration, w, which has to be between 0 and
around a few hours, we choose a lognormal distribution for
the proposal distribution in order to avoid negative values.

The period, P, requires special consideration. The
structure of many kinds of periodograms shows that the
likelihood function, when seen as a function of the period, has a
very complex structure of sharp local maxima and minima.
Even for good-quality data, we expect the likelihood to have
sharp peaks at harmonics and subharmonics of the correct
period. We propose to use this shortcoming to our
advantage, by using a "jumping" proposal distribution for log P.
Thus, besides the small steps around the previous value of
the Markov chain, we propose to allow, with some specified
probability, jumps to a period which is a multiple or a
divisor of the current period. The probability of moving to an
integer multiple or divisor of the current value of the
period is Prob = 1/10, while with probability 9/10
the moves are the usual ones around the current value of the
Markov chain. This should allow a more efficient exploration
of the parameter space.
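One possible form of such a "jumping" proposal is sketched below. Only the 1/10 jump probability comes from the text; the function name `propose_period`, the log-step size, the largest integer factor and the equal chance of multiplying or dividing are assumptions made for illustration. With these choices a jump from P to kP is as likely as the reverse jump, so the proposal is symmetric in log P.

```python
import math
import random

def propose_period(period, rng, step=0.01, jump_prob=0.1, max_factor=5):
    """Propose a new period: a harmonic jump, or else a small step in log P."""
    if rng.random() < jump_prob:
        k = rng.randint(2, max_factor)          # assumed range of integer factors
        if rng.random() < 0.5:
            return period * k                   # jump to an integer multiple
        return period / k                       # jump to an integer divisor
    return period * math.exp(rng.gauss(0.0, step))  # usual local move in log P

rng = random.Random(1)
current = 3.5
proposals = [propose_period(current, rng) for _ in range(10000)]

# Count proposals that are clearly harmonic jumps rather than local moves.
n_jumps = sum(1 for p in proposals if p / current > 1.5 or p / current < 0.75)
jump_fraction = n_jumps / len(proposals)
```

In a full sampler the proposed period would be accepted or rejected by the usual Metropolis-Hastings rule, so a jump to a harmonic only survives if the data support it.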



4 HD 209458 

We first apply the strategy to the Hipparcos Epoch
Photometry of HD 209458. Hipparcos observed HD 209458 (HIP
108859) at 89 non-uniformly distributed epochs, over a
time-span of about 1084 d. We use all the data points since
they all have a quality flag ≤ 2, which means they were
accepted by at least one of the two data reduction consortia
(Perryman & ESA 1997). The estimated standard errors of
the individual H_p magnitudes are around 0.015 mag, which
is of the order of the transit depth (the signal we are looking
for).

In the Bayesian framework we choose priors for the
parameters as described in Section 3 and allow the algorithm
to explore the parameter space in order to find the different
solutions that fit the data. The resulting posterior
distributions for the relevant parameters are shown in Fig. 1. As
can be seen from the period histogram, the distribution does
not favor any single period, but rather has several distinctive
peaks. The most likely period is P ≈ 3.52 d, which is
consistent with the known period of HD 209458 (Castellano et al.
2000), while the other probable periods are different periods
that fit the data as well.

We performed the Wald test to test the hypothesis of
the presence of a planet that is transiting the star and found
that the expected value of the transit depth posterior
distribution in our analysis is E(d) = 0.02 mag, and the value
of the Wald test (equation 3) is W = 4.22, a result with a
4σ significance.

We continue with the second part of the strategy:
examining the most probable time to observe the star in a
follow-up observation. Our simulated directed follow-up
explores one year beginning a month after the last Hipparcos
observation and finds the best times to observe the star
in order to sample the transit. The follow-up predictions are
shown on the bottom-right panel of Fig. 1 (the ITP
function). We add the known transit light curve (based on the
orbital parameters derived by Castellano et al. (2000)) to
the figure for comparison between the predictions and the
actual transits. The time that was most preferred by our pre- 
dictions indeed fits inside a transit, meaning it would have 
been possible to detect the transit with only one follow-up 
observation conducted after Hipparcos using our proposed 
strategy. We examined the skewness of the ITP for prioriti- 
zation purposes and found it to be S = 1.4. 

As will be shown in Section 5 and also in Table 1, the
ITP we obtained from Hipparcos data of HD 209458 is an 
exception, with a relatively low skewness, while other ITP 
skewness values for data that contain a transit signal are 
usually higher. We can understand this anomaly by looking 
at the ITP of the star in question: there are many peaks of 
the follow-up probability, and they are distributed over the 
[Figure panel titles: "HD 209458 - Hipparcos: parameters histograms"; "HD 209458: ITP - 10 years after Hipparcos"]
Figure 1. Top: HD 209458 - Histograms of the posterior
probability distribution functions of four orbital parameters found using
the MH algorithm for the Hipparcos measurements of the star:
the period P, time of mid-transit T_c, transit duration w and the
depth of the transit d. Bottom: ITP function for the first year after
Hipparcos observations, compared with the known transit light
curve (orbital parameters derived by Castellano et al. 2000).
The significant peaks of the ITP fit mid-transit time; therefore, a
single follow-up observation could have detected the transit.



entire range we examined (one year post- Hipparcos), with 
significant values for follow-up (above 0.1). The relatively 
symmetric distribution of the ITP is the cause of the low 
skewness value. The significance of the ITP for the Hipparcos 
data of HD 209458 is still high enough for a follow-up obser- 
vation to be worthwhile, and together with the Wald statis- 
tic, the star would have gotten a high priority for follow-up 
observations, despite the relatively low skewness. 

We also check the follow-up predictions for 10 yr after
Hipparcos, as shown in Fig. 2. Long after the last
observation, the ITP decreases, although some peaks remain, and
when looking carefully, we can see that even three years af- 
ter Hipparcos, we could have detected the transit using our 
follow-up predictions. 

The final step of the strategy described in Section 2
is to perform follow-up observations according to the most
significant peak of the ITP. Since the time that has elapsed
since Hipparcos causes the ITP peaks to be smeared, we
cannot perform current follow-up observations for the significant
ITP peaks found using the algorithm, so we simulated such
observations, and then combined them with the Hipparcos
data, to recalculate the ITP.

The simulation generated four observations (four single 
data points) inside and outside the predicted time of the 
transit (the significant peak of the ITP), with a typical error 
Figure 2. HD 209458: ITP function for 10 yr after Hipparcos
observations. The probability of sampling a transit smears out as
time elapses after the observations, but even three years after
Hipparcos, a transit detection was possible.



[Figure panel title: "HD 209458: parameters histograms - Hipparcos + simulated observation"]

Figure 3. HD 209458: combined data sets of Hipparcos and the
simulated observation. The simulation generated four single
measurements at the time directed by the most significant ITP peak.
According to the known light curve of the transit, this peak fits
mid-transit time. Top: Histograms of the orbital parameters found
using the MH algorithm for the combination of the data sets.
Bottom: ITP for the first year after the simulation, compared with
the known light curve of the transit. The transit detection was
feasible, since all histograms are centered around the values of the
orbital elements of the transiting planet: P = 3.5247 d, w = 0.1
d and d = 0.022 mag (Castellano et al. 2000).



of 0.001 mag. In the simulation we used the known transit 
light curve to generate the observation. The new histograms 
for the combined data sets are presented on the top panel 
of Fig. 3. It is clear that the first observation that could
have been performed using the directed follow-up would have
been enough to detect the transit, since the histograms are 
centered around the parameters of the planetary transit of 






Y. Dzigan and S. Zucker 

Table 1. Wald statistic of the transit depth posterior probability distribution function and skewness of the ITP.

Data/simulation    Wald statistic of the transit depth    Skewness of the ITP

First simulation of transit
Second simulation of transit
HD 209458 - Hipparcos
HD 189733 - Hipparcos
HD 209458 - Hipparcos and one simulated observation
HD 189733 - Hipparcos and one simulated observation
HD 189733 - Hipparcos and two simulated observations
HD 189733 - Hipparcos and three simulated observations
HD 189733 - Hipparcos and four simulated observations
HD 189733 - Hipparcos and five simulated observations
Noise alone
First permutation of transit simulation
Second permutation of transit simulation
Third permutation of transit simulation
HD 209458 - Hipparcos permutation
HD 189733 - Hipparcos permutation
HD 86081 (no transit) - Hipparcos
HD 212301 (no transit) - Hipparcos
HD 209458b. The simulated new observation excludes all the
spurious periods which the MH algorithm proposed based on
the Hipparcos data alone. The Wald statistic for the transit
depth now increased to W = 13.46. As can be seen from
Fig. 3 (bottom panel), the new directed follow-up that relies
on both Hipparcos and the new simulated observation fits
the planetary transit of HD 209458b perfectly, with
a high ITP value, close to 1, and an ITP skewness
of S = 4.8.



5 HD 189733 

We next applied the procedure to the Hipparcos data for HD
189733. Hipparcos observed HD 189733 (HIP 98505) at 185
non-uniformly distributed epochs, over a time span of 1083 d.
We chose to use only the 176 measurements that were accepted
by at least one of the two data reduction consortia. The
estimated standard errors of the individual H_p magnitudes
are around 0.012 mag, which is of the order of the transit
depth, similarly to HD 209458.

The posterior distributions of the model parameters are
shown on the top panel of Fig. 4. As can be seen from the
figure, the correct orbital period of the planet, P = 2.2185
d (Hébrard & Lecavelier Des Etangs 2006), is not detected,
and instead other periods fit the data. The preferred
mid-transit time is T_c - 2440000 = 8460.21 JD, while the actual
time of mid-transit according to Bouchy et al. (2005) is T_c -
2440000 = 8460.11 JD.

Again, we used the posterior distribution to perform 
the Wald test, to test the hypothesis of the presence of a 
planet that is transiting the star, although we obviously did 
not detect the correct period. The expected value of the 
transit depth posterior distribution is E(d) = 0.024 mag, 
and the value of the Wald test is W = 3.63, which indicates 
that follow-up observations are worthwhile since a transit is 
highly probable for this star. 

[Figure panel title: "HD 189733: Hipparcos - parameters histograms"]

Figure 4. HD 189733. Top: Histograms of the orbital
parameters found using the MH algorithm for the Hipparcos data:
the period, time of mid-transit, transit duration and the depth
of the transit. The periods that are most probable using the
MH algorithm differ from the planetary period (P ≈ 2.218
d) (Hébrard & Lecavelier Des Etangs 2006). Bottom: ITP for
the first year after Hipparcos observations, according to the
procedure described in Section 2, compared with the transit
light curve, derived using the known orbital parameters
(Hébrard & Lecavelier Des Etangs 2006). The significant peaks
do not fit mid-transit time, hence a single follow-up would not be
sufficient for transit detection.

[Figure panel titles: "HD 189733: Hipparcos+1st simulated observation - parameters histograms"; "HD 189733: Hipparcos+2nd simulated observation - parameters histograms"]

Figure 5. HD 189733: Combined data sets of Hipparcos and the
first observation simulation at the directed time. Top: Histograms
of the orbital parameters. Bottom: ITP function.

The follow-up predictions we have for a year that starts
one month after the Hipparcos observations are shown on
the bottom panel of Fig. 4, compared with the known transit
light curve of HD 189733b. This time a single follow-up
observation would not have been enough to detect the
transit, since the significant peaks in the follow-up do not
match the mid-transit time. These results for HD 189733
might have been caused by the stellar microvariability that
Hébrard & Lecavelier Des Etangs (2006) described. In their
posterior detection of HD 189733b in Hipparcos they checked
for long-term periodicity and found several significant
periods that, when removed, improved the $\chi^2$ statistic they
obtained for the known orbital period of the planet. From the high
ITP of the follow-up predictions, its skewness (S = 4.7), and
the result of the Wald test, this star would get high priority
for follow-up observations. Although the observing time
would not fall in the middle of the transit, as soon as a new
observation at the preferred time was obtained,
it would be added to the data we already have,
and the procedure would be repeated, this time hopefully
excluding the false periods.

In order to check this claim we simulated a follow-up
observation at the time the algorithm directed. Recall that
in this case (as opposed to the case of HD 209458), the directed
time was not in transit. We then added it to the Hipparcos
observations to recalculate the posterior distributions of the
parameters, as well as to propose a new time for the next
observation. We had to simulate a total of five follow-up
observations, each containing four "exposures" with an error
of 0.001 mag, in order to finally detect the transit itself. In
Figs 5-9 we present the changes in the posterior distribution,
as well as the directed follow-up predictions, as more
simulated observations are added to the original data. Each
observation eliminates some of the periods, making room for
Figure 6. HD 189733: Combined data sets of Hipparcos, first
and second observation simulations. Histograms of the orbital
parameters and ITP function.



other periods to emerge, until an actual transit is observed 
at the final observation. By the fifth follow-up observation, 
the only probable periods left were the transit period and its 
multiples. Table 1 summarizes the Wald statistics and the
skewness of the ITP for the original Hipparcos data, together 
with the simulated observations. 
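
The scheduling step that is repeated after each simulated
observation can be illustrated with a short sketch. It is not the
code used for the simulations above. It assumes a box-shaped
transit, takes the ITP at a given time to be the fraction of
posterior samples (P, Tc, w) that place that time inside a transit,
and directs the next observation to the grid time with the highest
ITP. All function names and the time grid are hypothetical.

```python
import numpy as np

def in_transit(t, period, t_mid, width):
    """True where time t falls inside a box-shaped transit window."""
    # Offset from the nearest mid-transit, folded into [-P/2, P/2).
    dt = (t - t_mid + 0.5 * period) % period - 0.5 * period
    return np.abs(dt) <= 0.5 * width

def itp(times, periods, t_mids, widths):
    """Fraction of posterior samples that predict a transit at each time."""
    t = np.asarray(times, dtype=float)[:, None]        # (n_times, 1)
    p = np.asarray(periods, dtype=float)[None, :]      # (1, n_samples)
    tc = np.asarray(t_mids, dtype=float)[None, :]
    w = np.asarray(widths, dtype=float)[None, :]
    return in_transit(t, p, tc, w).mean(axis=1)        # (n_times,)

def next_observation_time(times, periods, t_mids, widths):
    """Grid time with the highest ITP, together with the full ITP curve."""
    prob = itp(times, periods, t_mids, widths)
    return np.asarray(times)[np.argmax(prob)], prob
```

In an adaptive loop of the kind described above, the new
measurement taken at the returned time would be appended to the
data, the posterior samples would be regenerated, and the function
would be called again.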



6 'SANITY CHECKS' 

As we demonstrated above, it was possible to detect the 
transiting exoplanets orbiting both HD 209458 and HD 
189733 using the follow-up strategy we proposed. We now
examine cases in which no transit signal is expected.



6.1 HD 209458 - permuted data 

We perform the first test by randomly permuting the Hip- 
parcos data of HD 209458 and repeating the simulated 
follow-up procedure described above. The result of the Wald 
test, W = 1.45, shows that the likelihood of a transit for the 
permuted data is low. The predicted ITP for the directed 
follow-up is small over the entire time span that we checked, with
a skewness value of S = 0.5, which again implies that there
is no preferred time to observe a transit. This means that
it is unlikely that there is a transit signal in the permuted
data.
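
A minimal sketch of such a permutation test is given below. It
assumes that "permuting the data" means shuffling the magnitudes
(together with their errors) among the fixed observation epochs,
and that the skewness S is the usual sample skewness of the ITP
values over the checked time span. Both are assumptions made for
illustration, and the function names are hypothetical.

```python
import numpy as np

def permute_photometry(times, mags, errs, seed=None):
    """Shuffle magnitudes and their errors among the observation epochs."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(mags))
    # The epochs stay in place, so the sampling pattern is preserved
    # while any periodic signal in the magnitudes is destroyed.
    return np.asarray(times), np.asarray(mags)[order], np.asarray(errs)[order]

def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    v = np.asarray(values, dtype=float)
    centred = v - v.mean()
    sigma = centred.std()
    return 0.0 if sigma == 0 else float(np.mean(centred**3) / sigma**3)
```

An ITP curve dominated by a few narrow peaks yields a large
positive skewness, whereas a curve without preferred times yields
a value close to zero.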



6.2 HD 189733 - permuted data 

As for HD 209458, we randomly permuted the data of HD 
189733 in order to test our procedure. The Wald statistic for 
Figure 7. HD 189733: Combined data sets of Hipparcos, first,
second and third observation simulations. Histograms of the
orbital parameters and ITP function.

the permuted data is W = 1.5, which means that it is not
likely to find a transit based on the permuted data. Although
the computed ITP in the directed follow-up is not as low as for
the permuted data of HD 209458, its skewness (S = 1.0),
combined with the small value of the Wald statistic, means
that it is not worthwhile to perform follow-up observations
in search of transits. Fig. 10 shows the comparison between
the transit-depth histograms of HD 189733 and HD 209458
and the histograms of the permuted data of both stars, combined
with the follow-up predictions of the permuted data.
The clear difference between the transit-depth histograms of
data that contain a transit signal and those of the permuted
data, as expressed by the Wald statistic, is a good indication of
the strength of the Wald statistic as a prioritization tool for
follow-up observations.
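
How a statistic of this kind separates a peaked transit-depth
histogram from a flat one can be illustrated as follows. The form
used here, the posterior mean depth divided by the posterior
standard deviation, is an assumption made for the sake of the
example and may differ in detail from the Wald test defined
earlier in the paper. The function name is hypothetical.

```python
import numpy as np

def wald_statistic(depth_samples):
    """Posterior mean transit depth in units of its posterior standard deviation."""
    d = np.asarray(depth_samples, dtype=float)
    # ddof=1 gives the unbiased sample estimate of the variance.
    return float(d.mean() / d.std(ddof=1))
```

Under this assumed definition, depth samples spread uniformly
between zero and some maximum depth give a value close to the
square root of 3 (about 1.7), while a histogram concentrated
around a non-zero depth gives a much larger value.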



Figure 8. HD 189733 : Combined data sets of Hipparcos, first, 
second, third and fourth observation simulations. Histograms of 
the orbital parameters and ITP function. This time the actual 
period of the planet (P = 2.218574 d) is detected, and the most 
significant peak of the ITP fits the transit epoch. 

6.3 HD 86081 and HD 212301

Besides examining the randomly permuted data of HD
209458 and HD 189733, we also applied our procedure to
two stars for which we have reasons to believe there is no
transit signal. We chose the two stars HD 86081 and HD
212301, which are known to harbor short-period planets.
Since the planets are known to have short orbital periods,
the fact that no transits were detected (Johnson J. A. et al.
2006; Lo Curto G. et al. 2006) means it is very unlikely that
the stars have other Hot Jupiters orbiting them, thus making
them perfect targets for testing our procedure, as negative
test cases. Hipparcos observed HD 86081 for 71 reliable
epochs, and HD 212301 for 123, during the operation time
of the satellite, numbers which are similar to the number
of Hipparcos measurements of HD 209458. We applied the
strategy to both data sets. Fig. 11 shows the corresponding
depth histograms and the ITP function for both stars. HD
86081 and HD 212301 did not show any significant value for
the ITP, and the highest value was at least one order of
magnitude below the predictions for HD 209458 and HD 189733.
As a result, the skewness values of the ITP of both stars
were low as well (smaller than 0.5), which indicates that a
follow-up observation is not worthwhile for those stars. The
transit-depth Wald statistics for the two stars were W = 1.2
for HD 86081 and W = 1.4 for HD 212301, which again
indicates that transits are unlikely.



7 CONCLUDING REMARKS 

In this paper we proposed a novel approach to the design 
of follow-up observations of low-cadence photometric sur- 
veys, in a way that will maximize the chances to detect 
planetary transits. Examples of such surveys are Hipparcos,
ASAS, and Gaia, the successor of Hipparcos. The strategy may
also be beneficial for the Large Synoptic Survey Telescope
(Jurić & Ivezić 2011), and for the Pan-STARRS ground-based
survey, especially for directing follow-up observations of hot
Jupiters transiting M-dwarf stars in the Medium Deep survey
(Dupuy & Liu 2009; Ford et al. 2008).

We tested our proposed procedure on two stars with 
transiting planets that were observed by Hipparcos during 
Figure 9. HD 189733: Combined data sets of Hipparcos and
all five observation simulations. Histograms of the orbital
parameters and ITP predictions. Using the MH algorithm, the
transit is found, along with the parameters that characterize it
(P = 2.218574 d, Tc = 2453988.80331 (HJD), w = 0.0589 d,
and d = 0.033 mag in the Hipparcos Hp system, derived by
Hébrard & Lecavelier Des Etangs (2006) and Winn J. N. et al.
(2007)). All the new peaks of the ITP fit the mid-transit times
with high detection probability.

transits: HD 209458 and HD 189733. We showed that, without
any prior information regarding the orbital elements of
the planets, it was possible to use the available Hipparcos
data base to direct follow-up observations for both stars
and thus detect the planetary transits with minimal
observational effort. This makes use of the fact that the Bayesian
approach allows the inclusion of new data, which reflect a new
state of knowledge, in an easy and straightforward fashion.

The Hipparcos examples we analyzed are only test cases
to demonstrate the algorithm's capabilities. Using Hipparcos
in such a fashion to detect planets is already impractical, due
to the long time that has elapsed since the completion of the
mission. The effect of the elapsed time is clearly seen in
the way the ITP decreases over 10 yr (Fig. 2). We have
shown that one year after Hipparcos, it was possible to use its
data to direct photometric follow-up observations that could
have detected the planetary transits in only one follow-up
observation for HD 209458, and in five observations for HD
189733.
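
One plausible way to see why the predictions degrade with time is
to propagate the ephemeris uncertainty forward. The sketch below
assumes independent Gaussian uncertainties on the period and on
the reference mid-transit time, an assumption that is not part of
the analysis above; the function name and any numbers supplied to
it are purely illustrative.

```python
import math

def mid_transit_sigma(sigma_tc, sigma_p, n_cycles):
    """1-sigma uncertainty of a predicted mid-transit time after n_cycles orbits.

    Assumes independent Gaussian errors on Tc and P, so the period
    error accumulates linearly with the number of elapsed cycles.
    """
    return math.sqrt(sigma_tc**2 + (n_cycles * sigma_p) ** 2)
```

Once this uncertainty becomes comparable to the transit duration,
the posterior samples no longer agree on when the transit occurs
and the peaks of the ITP are expected to flatten.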

In cases where only the Wald statistic has high significance,
but the ITP is relatively low, we might recommend
performing spectroscopic follow-up instead of a photometric
one, since an RV search is less dependent on precise knowledge
of the transit phase, because the goal is then to sample all
phases of the orbit. Since Hipparcos observations were performed
almost two decades ago, and due to the fact that the



Figure 10. HD 209458 and HD 189733. Top: Comparison be- 
tween the transit-depth histograms for the Hipparcos data and 
for the permuted data, with the associated values of the Wald 
test. Bottom: ITP function for the permuted data sets. 



ITP has lost its significance, it may be more productive to
perform RV follow-up observations for candidate stars found in
Hipparcos alone. We will explore this option in future work.

Obviously, the procedure suggested here need not
be confined to the search for transiting planets, but can also
be applied to the search for other kinds of periodic variables,
such as eclipsing binaries and Cepheids. This will probably
require some modifications of the procedure and algorithm.
In this context, it is important to mention a similar approach
of adaptive scheduling, which Tom Loredo proposed for the
purpose of optimizing RV observations. The approach, adaptive
Bayesian exploration (ABE; Loredo 2004), is much
more general, attempting to optimize the information obtained
by every additional observation for the purpose of
estimating the parameters of the model behind the observations.
The formulation of our problem is much more specific
and simple: we want to optimize our chances to 'catch' the
transit using well-scheduled follow-up observations. While
every RV measurement contributes in some way to the orbital
solution, the contribution of an individual photometric
measurement to the transit solution boils down to the binary
question whether it is in the transit or not. Thus, ABE
uses an elaborate merit function that quantifies the amount
of information in the RV measurements. ABE can probably
be applied to our problem as well, but we feel it would be
redundant due to the simpler nature of the problem. We
speculate that the two approaches would yield very similar
results.

Our experience shows that the MH algorithm and the
ITP tend to find all possible periods that fit the data. Because
of the low cadence of Hipparcos measurements, some
hypothetical models may fit the data simply because the
"transits" occur during 'gap' intervals, when no observations
were made. Thus, the follow-up prediction function,
when generalized to a broader model space of periodic variables,
will allow constructing a follow-up strategy that will
complement the low-cadence observations in a way that will
optimize the period coverage.

Figure 11. HD 86081 and HD 212301: transit-depth histograms
and ITP predictions as derived using Hipparcos data for both
stars. The depth histograms are distributed over the whole range,
with low Wald statistic for the transit depth. Together with the
low significance of the ITP, this means the stars would not be
prioritized for follow-up observations.

MCMC methods, such as the MH algorithm, can be
very demanding in terms of processing time. Therefore,
improving the efficiency and automating the strategy in order
to explore large data bases is crucial to its usefulness. Thus,
we are examining the idea of reducing the number of model
parameters that the MH algorithm explores to three main
parameters: the transit period, duration, and mid-transit
epoch, while marginalizing over the other two parameters:
the transit depth and mean magnitude out of transit. The
marginalization will hopefully shorten the computing time.
Another idea worth examining is using a BLS-like algorithm,
which will scan the (P, Tc, w) space and calculate the likelihood
of each configuration, from which it will build, in a
Bayesian fashion, the ITP function. Such scanning is obviously
a compromise, since it is discrete and finite by nature,
and the coverage of the parameter space may be lacking.
However, the gain in computation time compared to a Monte
Carlo approach might be worth the price.
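
A toy version of such a grid scan might look as follows. It is a
sketch under several assumptions that go beyond the text: the
transit is a box, Gaussian errors are assumed, the prior is flat
over the grid points, Tc is parametrized by an orbital phase
relative to t = 0, and the depth and out-of-transit level are
fitted at each grid point (a profile likelihood, with no sign
constraint on the depth) rather than marginalized. All names are
hypothetical.

```python
import numpy as np
from itertools import product

def in_transit(t, period, t_mid, width):
    """True where t lies within half a transit width of a mid-transit time."""
    dt = (t - t_mid + 0.5 * period) % period - 0.5 * period
    return np.abs(dt) <= 0.5 * width

def box_chi2(t, mag, err, period, t_mid, width):
    """Chi-square of the best-fitting two-level (box) model for one triplet."""
    inside = in_transit(t, period, t_mid, width)
    inv_var = 1.0 / err**2
    chi2 = 0.0
    for mask in (inside, ~inside):
        if mask.any():
            # Weighted mean level of this group (in or out of transit).
            level = np.sum(inv_var[mask] * mag[mask]) / np.sum(inv_var[mask])
            chi2 += np.sum(inv_var[mask] * (mag[mask] - level) ** 2)
    return chi2

def grid_itp(t, mag, err, periods, phases, widths, future_times):
    """Scan a (P, Tc, w) grid and return the likelihood-weighted ITP."""
    models = np.array([(p, ph * p, w)
                       for p, ph, w in product(periods, phases, widths)])
    chi2 = np.array([box_chi2(t, mag, err, *m) for m in models])
    weight = np.exp(-0.5 * (chi2 - chi2.min()))   # relative likelihoods
    weight /= weight.sum()                        # flat prior over the grid
    hits = in_transit(np.asarray(future_times, dtype=float)[:, None],
                      models[:, 0], models[:, 1], models[:, 2])
    return hits.astype(float) @ weight, models, weight
```

Grid points whose transits contain no data points reduce to the
constant model here, which reproduces in a simple way the 'gap'
effect mentioned above. The resolution of the grid in period and
phase controls both the run time and the risk of missing narrow
likelihood peaks.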

At this stage the simulations we have presented in this
paper are a feasibility test, based on Hipparcos Epoch
Photometry. The encouraging preliminary results we present
here lead us to believe that the strategy can be beneficial for
the Hipparcos successor, Gaia (Eyer et al. 2009). Gaia, whose
launch is planned for 2012, will measure about a
billion stars in our Galaxy and in the Local Group, and will
perform, besides ultraprecise astrometry, also spectral and
photometric observations. Gaia is supposed to improve on
the accuracy of Hipparcos using larger mirrors, more efficient
cameras and detectors and better software to reduce
the data. In its photometric mission, Gaia will scan the
whole sky, with a photometric precision of 1 mmag for the
brightest stars, and up to 20 mmag at a magnitude of 20
(Eyer et al. 2009). Gaia's main exoplanet search programme
is focused on detection through astrometric motion
measurements. The strategy we propose here may be generalized to
direct follow-up efforts of Gaia's photometry, aimed at
detecting transiting exoplanets.